<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>HTML</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
  <script src="../toc.js" type="text/javascript">



  //<![CDATA[

   <!-- empty -->
  //]]>
  </script>
  <script type="text/javascript">




  /* <![CDATA[ */
    (function() {
        var s = document.createElement("script"), t = document.getElementsByTagName("script")[0];
        s.type = "text/javascript";
        s.async = true;
        s.src = "http://api.flattr.com/js/0.6/load.js?mode=auto";
        t.parentNode.insertBefore(s, t);
    })();
  /* ]]> */
  </script>
  <style type="text/css">
/*<![CDATA[*/

  body { counter-reset: chapter 11; }

  a.sgc-1 {display:none;}
  /*]]>*/
  </style>
</head>

<body>
  <div class="chapter">
    <h1 class="en" id="heading_id_2">HTML</h1>

    <h1 class="zh" id="heading_id_2">关于HTML</h1>
  </div>

  <div class="preface">
    <p class="en">The Web Server was originally created to serve HTML documents. Now it is used to serve all sorts of documents as well as data of different kinds. Nevertheless, HTML is still the main document type delivered over the Web. Go has basic mechanisms for parsing HTML documents, which are covered in this chapter</p>
    <p class="zh">Web服务器的建立最开始是用来提供HTML文件服务的。现在它能为各种类型的文档和各种不同类型的数据提供服务。然而，HTML依然是互联网网络中传递的主要文档类型。Go有一套基本机制来解析HTML，本章主要阐述此内容。</p>
  </div>

  <div class="generate_from_h2" id="generated-toc"></div>

  <h2 class="en" id="heading_id_3">Introduction</h2>

  <h2 class="zh" id="heading_id_3">介绍</h2>

  <p class="en">The Web was originally created to serve HTML documents. Now it is used to serve all sorts of documents as well as data of dirrent kinds. Nevertheless, HTML is still the main document type delivered over the Web</p>

  <p class="zh">Web服务器的建立最开始是用来提供HTML文件服务的。现在它为各种类型的文档和各种不同类型的数据提供服务。然而，HTML仍然是互联网网络中传递的主要文档类型。</p>

  <p class="en">HTML has been through a large number of versions, and HTML 5 is currently under development. There have also been many "vendor" versions of HTML, introducing tags that never made it into standards.</p>

  <p class="zh">HTML经历了大量的版本变迁，HTML5目前还在开发阶段。此外出现不少“独立供应商”版的HTML，但引入的标签从来没有做成标准。</p>

  <p class="en">HTML is simple enough to be edited by hand. Consequently, many HTML documents are "ill formed", not following the syntax of the language. HTML parsers generally are not very strict, and will accept many "illegal" documents.</p>

  <p class="zh">HTML足够简单，以至于可以纯手工编写。因此，许多HTML文件格式不规范，没有遵守标准准则的语法。HTML解析器通常也不是很严格，而且能接受大多数格式“不严格”的文件。 

</p>

  <p class="en">There wasn't much in earlier versions of Go about handling HTML documents - basically, just a tokenizer. The incomplete nature of the package has led to its removal for Go 1. It can still be found in the <code>exp</code> (experimental) package if you really need it. No doubt some improved form will become available in a later version of Go, and then it will be added back into this book.</p>

  <p class="zh">在早期版本的Go没有太多关于处理HTML文件的细节--基本上只是一个分词器。不完整的原始包在Go 1的版本中已移除。如果你真的需要它，仍然可以在<code>exp</code>(试验)包中找到它。 毫无疑问，Go未来版本在这方面会有一些改进的地方，那么到时将会添加到本书中。 

</p>

  <p class="en">There is limited support for HTML in the XML package, discussed in the next chapter.</p>

  <p class="zh">在XML包中对HTML的支持是有限的，在下一章将会讨论。 </p>
  <!--

  <p class='en'>Go has basic parsing mechanisms based on a tokeniser.
  This allows you to process HTML documents as they are read,
  but if you want to, say, build a parse tree, then you have to do that
  yourself.</p>
  <p class='zh'>Go的基本解析机制是基于分词器。这允许您边读取HTML文档边处理,
但是如果你想建立一个解析树,然后你必须靠自己这样做。</p>
  <p class='en'>The state of this package is currently marked as <em>incomplete</em>
  so it will change over time.</p>
<p class='zh'>这个包的状态还是标记<em>未完成</em>所以后面会有变化</p>

<h2 class="en"> Tokenizer </h2>
<h2 class="zh"> Tokenizer </h2>
<p class="en">
  The <code>html</code> implements a basic tokenizer that can 
  used to parse HTML. The following program reads a file of HTML text
  and prints information about the text tokens and the tags:
  <pre><code><font color="black">
<font color="firebrick">/* Read HTML
 */
</font>
<font color="purple">package</font> main

<font color="purple">import</font> (
        <font color="DarkRed">"fmt"</font>
        <font color="DarkRed">"html"</font>
        <font color="DarkRed">"io/ioutil"</font>
        <font color="DarkRed">"os"</font>
        <font color="DarkRed">"strings"</font>
)

<font color="purple">func</font> main() {
        <font color="purple">if</font> len(os.<font color="blue">Args</font>) != 2 {
                fmt.<font color="blue">Println</font>(<font color="DarkRed">"Usage: "</font>, os.<font color="blue">Args</font>[0], <font color="DarkRed">"file"</font>)
                os.<font color="blue">Exit</font>(1)
        }
        file := os.<font color="blue">Args</font>[1]
        bytes, err := ioutil.<font color="blue">ReadFile</font>(file)
        checkError(err)
        r := strings.<font color="blue">NewReader</font>(string(bytes))

        z := html.<font color="blue">NewTokenizer</font>(r)

        depth := 0
        <font color="purple">for</font> {
                tt := z.<font color="blue">Next</font>()

                <font color="purple">for</font> n := 0; n &lt; depth; n++ {
                        fmt.<font color="blue">Print</font>(<font color="DarkRed">" "</font>)
                }

                <font color="purple">switch</font> tt {
                <font color="purple">case</font> html.<font color="blue">ErrorToken</font>:
                        fmt.<font color="blue">Println</font>(<font color="DarkRed">"Error "</font>, z.<font color="blue">Err</font>().<font color="blue">Error</font>())
                        os.<font color="blue">Exit</font>(0)
                <font color="purple">case</font> html.<font color="blue">TextToken</font>:
                        fmt.<font color="blue">Println</font>(<font color="DarkRed">"Text: \""</font> + z.<font color="blue">Token</font>().<font color="blue">String</font>() + <font color="DarkRed">"\""</font>)
                <font color="purple">case</font> html.<font color="blue">StartTagToken</font>, html.<font color="blue">EndTagToken</font>:
                        fmt.<font color="blue">Println</font>(<font color="DarkRed">"Tag: \""</font> + z.<font color="blue">Token</font>().<font color="blue">String</font>() + <font color="DarkRed">"\""</font>)
                        <font color="purple">if</font> tt == html.<font color="blue">StartTagToken</font> {
                                depth++
                        } <font color="purple">else</font> {
                                depth==
                        }
                }
        }

}

<font color="purple">func</font> checkError(err error) {
        <font color="purple">if</font> err != nil {
                fmt.<font color="blue">Println</font>(<font color="DarkRed">"Fatal error "</font>, err.<font color="blue">Error</font>())
                os.<font color="blue">Exit</font>(1)
        }
}
</font></code></pre>

-->

  <h2 id="heading_id_4" class="en">Conclusion</h2>

  <h2 id="heading_id_4" class="zh">结论</h2>

  <p class="en">There isn't anything to this package at present as it is still under development.</p>

  <p class="zh">目前这个包没有内容，因为它目前仍处于开发阶段。</p>

  <p class="en">Copyright Jan Newmarch, jan@newmarch.name</p>

  <p class="en">If you like this book, please contribute using Flattr <a class="FlattrButton sgc-1" href="http://jan.newmarch.name/go/index.html"></a><br />
  or donate using PayPal</p>

  <form action="https://www.paypal.com/cgi-bin/webscr" method="post">
    <input name="cmd" type="hidden" value="_s-xclick" /> <input name="encrypted" type="hidden" value="-----BEGIN PKCS7-----MIIHLwYJKoZIhvcNAQcEoIIHIDCCBxwCAQExggEwMIIBLAIBADCBlDCBjjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRYwFAYDVQQHEw1Nb3VudGFpbiBWaWV3MRQwEgYDVQQKEwtQYXlQYWwgSW5jLjETMBEGA1UECxQKbGl2ZV9jZXJ0czERMA8GA1UEAxQIbGl2ZV9hcGkxHDAaBgkqhkiG9w0BCQEWDXJlQHBheXBhbC5jb20CAQAwDQYJKoZIhvcNAQEBBQAEgYCCw7fVj6fuHxYMvE0PBlURcRgBFb1s4TxTUDgsS6BgkdJPt2GF8NFPNvE/oFvPNY2jBGrXSIkxCr9dFYzraKC8csPASWb0z9l8swwbIHWgrvb5cuaVuLbtRzesh94sqyh9MmZ5U1xcMrMtlw1S60gK5lPbKPsXzcY74brjt44J7jELMAkGBSsOAwIaBQAwgawGCSqGSIb3DQEHATAUBggqhkiG9w0DBwQIAXtre9K+AiWAgYiJVN0CmxAPscp0u0O8R0mD+cNz/Fe3lNIrqqMPplkri20WbbVxhbRwJTjtOxcLMbmSIeC8oWh14aSy9Jptgm1wNlQYADQQUgMnR/qIlYgHmXjJ4C6wZteqNVJn+RKfM/tS008Ola5SJABaGe9BmRSQCjMKqEyqm3Mx2hoLeWMXeyoMaW3Xteg6oIIDhzCCA4MwggLsoAMCAQICAQAwDQYJKoZIhvcNAQEFBQAwgY4xCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJDQTEWMBQGA1UEBxMNTW91bnRhaW4gVmlldzEUMBIGA1UEChMLUGF5UGFsIEluYy4xEzARBgNVBAsUCmxpdmVfY2VydHMxETAPBgNVBAMUCGxpdmVfYXBpMRwwGgYJKoZIhvcNAQkBFg1yZUBwYXlwYWwuY29tMB4XDTA0MDIxMzEwMTMxNVoXDTM1MDIxMzEwMTMxNVowgY4xCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJDQTEWMBQGA1UEBxMNTW91bnRhaW4gVmlldzEUMBIGA1UEChMLUGF5UGFsIEluYy4xEzARBgNVBAsUCmxpdmVfY2VydHMxETAPBgNVBAMUCGxpdmVfYXBpMRwwGgYJKoZIhvcNAQkBFg1yZUBwYXlwYWwuY29tMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDBR07d/ETMS1ycjtkpkvjXZe9k+6CieLuLsPumsJ7QC1odNz3sJiCbs2wC0nLE0uLGaEtXynIgRqIddYCHx88pb5HTXv4SZeuv0Rqq4+axW9PLAAATU8w04qqjaSXgbGLP3NmohqM6bV9kZZwZLR/klDaQGo1u9uDb9lr4Yn+rBQIDAQABo4HuMIHrMB0GA1UdDgQWBBSWn3y7xm8XvVk/UtcKG+wQ1mSUazCBuwYDVR0jBIGzMIGwgBSWn3y7xm8XvVk/UtcKG+wQ1mSUa6GBlKSBkTCBjjELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAkNBMRYwFAYDVQQHEw1Nb3VudGFpbiBWaWV3MRQwEgYDVQQKEwtQYXlQYWwgSW5jLjETMBEGA1UECxQKbGl2ZV9jZXJ0czERMA8GA1UEAxQIbGl2ZV9hcGkxHDAaBgkqhkiG9w0BCQEWDXJlQHBheXBhbC5jb22CAQAwDAYDVR0TBAUwAwEB/zANBgkqhkiG9w0BAQUFAAOBgQCBXzpWmoBa5e9fo6ujionW1hUhPkOBakTr3YCDjbYfvJEiv/2P+IobhOGJr85+XHhN0v4gUkEDI8r2/rNk1m0GA8HKddvTjyGw/XqXa+LSTlDYkqI8OwR8GEYj4efEtcRpRYBxV8KxAW93YDWzFGvruKnnLbDAF6VR5w/cCMn5hzGCAZowggGWAgEBMIGUMIGOMQswCQYDVQQGEwJVUzELMAkGA1UECBMCQ0ExFjAUBgNVBAcTDU1vdW50YWluIFZpZXcxFDASBgNVBAoTC1BheVBhbCBJbmMuMRMwEQYDVQQLFApsaXZlX2NlcnRzMREwDwYDVQQDFAhsaXZlX2FwaTEcMBoGCSqGSIb3DQEJARYNcmVAcGF5cGFsLmNvbQIBADAJBgUrDgMCGgUAoF0wGAYJKoZIhvcNAQkDMQsGCSqGSIb3DQEHATAcBgkqhkiG9w0BCQUxDxcNMTEwNTAyMDcwNzQ1WjAjBgkqhkiG9w0BCQQxFgQUgvHyq74JT8DnmViqEqG5KpIW0cAwDQYJKoZIhvcNAQEBBQAEgYAzycmlaZMZjkmYniVBUVTQeywigBo+80toDP2g9+yCzO4mG1Abmfcr/S1XdT8djFA9w37F+F+nSkP857evscUhns30c9wYuPoiNudkJMOkYegqyq+EI4AMNGPuQNZ+4vznmqTgFTn9iQjONC8NGQ/0GuCCQ/AqJZs/0ZiWivlPhA==-----END PKCS7----- " /> <input alt="PayPal - The safer, easier way to pay online." border="0" name="submit" src="https://www.paypalobjects.com/WEBSCR-640-20110401-1/en_AU/i/btn/btn_donateCC_LG.gif" type="image" /> <img alt="" border="0" height="1" src="https://www.paypalobjects.com/WEBSCR-640-20110401-1/en_AU/i/scr/pixel.gif" width="1" />
  </form>
</body>
</html>
